DETAILED ACTION
Continued Examination Under 37 CFR 1.114 

1. A request for continued examination under 37 CFR 1.114, including the fee set forth in
37 CFR 1.17(e), was filed in this application after final rejection. Since this application is
eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e)
has been timely paid, the finality of the previous Office action has been withdrawn pursuant to
37 CFR 1.114. Applicant's submission filed on November 16, 2009 has been entered.

Response to Arguments 

2. Applicant's arguments with respect to claims 1-18 and 23 have been considered but are 
moot in view of the new ground(s) of rejection. 

3. Applicant's arguments filed November 16, 2009, with respect to Claim 33 have been fully
considered but they are not persuasive. As per Claim 33, Applicant argues that Joffe
(US006330584B1) does not describe checking to see if all of the programs are completed (p. 10).
According to Applicant, Joffe teaches that this is done for a single program, and not for all of
the plurality of programs (p. 11).

In reply, the Examiner points out that Joffe describes "If a task attempts to access an
unavailable resource, the task is suspended... When the resource becomes available, the
suspended task is resumed, and the instruction accessing the resource is re-executed... the task
does not get access to the same resource until after every other task sharing the resource has
finished accessing the resource" (col. 2, lines 29-39). If the Wait signal is asserted, instruction
execution is not completed and the PC register is frozen, but the task remains active, and the
instruction will be executed again starting the next clock cycle. If the Suspend/Wait signals are
deasserted, the PC register is changed to point to the next instruction (col. 10, lines 19-32). So,
Joffe teaches checking to see if the Suspend/Wait signals are deasserted, which indicates that
instruction execution is completed (col. 10, lines 19-32) and that the resource has become
available (col. 2, lines 32-34), and the resource becomes available after every other task sharing
the resource has finished accessing the resource (col. 2, lines 29-39). Since the resource only
becomes available after every other task sharing the resource has finished accessing the
resource, this means that Joffe teaches checking to see if all of the programs are completed.
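
For illustration only, the Wait behavior described above can be sketched as follows. This is a minimal sketch and is not taken from Joffe: the function name `run_task` and the `wait_cycles` parameter are hypothetical, and the sketch assumes only that the PC register is frozen in any clock cycle in which the Wait signal is asserted and advances to the next instruction otherwise.

```python
from typing import List, Set


def run_task(num_instructions: int, wait_cycles: Set[int]) -> List[int]:
    """Return the PC value observed in each clock cycle.

    wait_cycles is a hypothetical set of clock cycles in which the Wait
    signal is asserted (for example, because a shared resource is busy).
    """
    pc = 0
    cycle = 0
    trace: List[int] = []
    while pc < num_instructions:
        trace.append(pc)
        if cycle not in wait_cycles:
            # Wait deasserted: the instruction completes and the PC
            # is changed to point to the next instruction.
            pc += 1
        # Otherwise the PC is frozen and the same instruction is
        # executed again starting the next clock cycle.
        cycle += 1
    return trace


if __name__ == "__main__":
    # Wait is asserted in cycles 1 and 2, so instruction 1 is retried.
    print(run_task(3, {1, 2}))
```

In this sketch the task remains active while it waits; only the advance of the PC is withheld until the Wait signal is deasserted.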

Claim Rejections - 35 USC §103 

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

5. The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 
(1966), that are applied for establishing a background for determining obviousness under 35 
U.S.C. 103(a) are summarized as follows: 

1. Determining the scope and contents of the prior art.

2. Ascertaining the differences between the prior art and the claims at issue. 

3. Resolving the level of ordinary skill in the pertinent art. 

4. Considering objective evidence present in the application indicating obviousness 
or nonobviousness. 

6. Claims 1, 2, 4, 7, 9, and 10 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Kohn (US005261063A) and Davis (US005357617A). 

7. As per Claim 1, Kohn teaches a programmable processor for executing a plurality of
programs, said programmable processor comprising: an execution pipeline having a depth of a
plurality of program instruction execution stages (eight) and a depth less than or equal to said
plurality of programs (eight) wherein each of the plurality of programs comprises a plurality of
instructions; and an interleaver for interleaving instructions from said plurality of programs and
providing said instructions to said pipeline for execution such that the number of said plurality of
programs that are interleaved (eight) is greater than or equal to the depth of the pipeline (eight)
(col. 1, line 64-col. 2, line 14).
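
For illustration only, the interleaving limitation discussed above can be sketched as follows. This is a minimal sketch and is not taken from Kohn: it assumes a simple round-robin issue order of one instruction per cycle and programs of equal length, and the names `interleave` and `same_program_gap` are hypothetical.

```python
from typing import List, Tuple


def interleave(programs: List[List[str]]) -> List[Tuple[int, str]]:
    """Round-robin issue order: one instruction per cycle, one program at a time."""
    order: List[Tuple[int, str]] = []
    longest = max((len(p) for p in programs), default=0)
    for i in range(longest):
        for pid, prog in enumerate(programs):
            if i < len(prog):
                order.append((pid, prog[i]))
    return order


def same_program_gap(order: List[Tuple[int, str]], pid: int) -> int:
    """Smallest number of cycles between two issues from the same program.

    Requires that program pid issues at least two instructions.
    """
    cycles = [c for c, (p, _) in enumerate(order) if p == pid]
    return min(b - a for a, b in zip(cycles, cycles[1:]))


if __name__ == "__main__":
    depth = 8  # pipeline depth (eight stages)
    programs = [[f"p{p}i{i}" for i in range(2)] for p in range(8)]
    order = interleave(programs)
    # With eight programs, consecutive instructions of one program are
    # issued eight cycles apart, which is not less than the depth.
    print(same_program_gap(order, 0) >= depth)
```

Under these assumptions, when the number of interleaved programs is greater than or equal to the depth of the pipeline, an instruction has left the pipeline before the next instruction of the same program is issued.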

However, Kohn does not expressly teach that the plurality of programs have an average 
pipeline latency of one instruction per cycle. However, Davis describes "the average throughput 
is one instruction per machine cycle because of the overlapped operations of three pipeline 
phases" (col. 1, lines 28-30) and "A n-phase pipeline is created by pipelining slower circuits so 
that parts of two different instructions are flowing through them (e.g., one instruction in a first 
cycle and another instruction in a second cycle)" (col. 7, lines 30-34). Thus, Davis teaches that 
the plurality of programs have an average pipeline latency of one instruction per cycle. 
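
For illustration only, the arithmetic behind an average of one instruction per cycle can be sketched as follows. The sketch assumes an ideal pipeline with no stalls, in which n instructions passing through a pipeline of a given depth take depth + n - 1 cycles; the function names are hypothetical and are not taken from Davis.

```python
def total_cycles(n: int, depth: int) -> int:
    """Cycles needed for n instructions in an ideal pipeline with no stalls."""
    if n <= 0:
        return 0
    # The first instruction takes `depth` cycles; each later instruction
    # completes one cycle after its predecessor.
    return depth + n - 1


def average_cycles_per_instruction(n: int, depth: int) -> float:
    """Average cycles per instruction; approaches 1 as n grows."""
    return total_cycles(n, depth) / n


if __name__ == "__main__":
    for n in (1, 10, 1000):
        print(n, average_cycles_per_instruction(n, depth=3))
```

With three phases, as in the passage quoted from Davis, the average approaches one cycle per instruction as the number of instructions grows.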

It would have been obvious to one of ordinary skill in the art to modify Kohn so that the
plurality of programs have an average pipeline latency of one instruction per cycle because Davis
suggests that overlapping the operations of the pipeline phases, and pipelining slower circuits so
that parts of two different instructions are flowing through them, such that the average pipeline
latency is one instruction per cycle, is advantageous because this improves the processing speed
(col. 1, lines 28-32; col. 7, lines 30-34).

8. As per Claim 2, Kohn teaches wherein said pipeline has a datapath with a depth (eight) 
equal to said number of programs (eight) (col. 1, line 64-col. 2, line 14).

9. As per Claim 4, Kohn teaches wherein each program of said plurality of programs is
independent of the other of said plurality of programs (col. 4, lines 45-46; col. 16, lines 20-22). 

10. As per Claim 7, Kohn teaches wherein one or more of control and computing resources, 
instructions, instruction memory, data paths, data memory, and caches are shared by said 
plurality of programs (col. 2, lines 51-63). 

11. As per Claim 9, Kohn teaches wherein said instructions comprise load instructions for 
loading data from a data memory (col. 8, lines 16-18). 

12. As per Claim 10, Kohn teaches storing the calculated result in data store 12 (col. 6, lines 
20-22), and this is inherently an instruction for performing this storing. Thus, Kohn teaches 
wherein said instructions comprise store instructions for storing data in a memory (col. 6, lines 
20-22). 

13. Claims 3, 5, 6, and 11 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Kohn (US005261063A) and Davis (US005357617A) in view of Eickemeyer (US006061710A). 

14. As per Claim 3, Kohn and Davis are relied upon for teachings as discussed above relative 
to Claim 1. 

However, Kohn and Davis do not teach wherein a next instruction from one of said 
plurality of programs is not provided to said pipeline until a previous instruction of said one of 
said plurality of programs has completed and in the meantime an instruction from another 
program is being executed by said pipeline. However, Eickemeyer teaches wherein a next 
instruction from one of said plurality of programs is not provided to said pipeline until a previous 
instruction of said one of said plurality of programs has completed (col. 3, lines 41-48) and in the
meantime an instruction from another program is being executed by said pipeline (col. 4, lines
14-25). 

It would have been obvious to one of ordinary skill in the art at the time of invention by 
applicant to modify Kohn and Davis so that a next instruction from one of said plurality of 
programs is not provided to said pipeline until a previous instruction of said one of said plurality 
of programs has completed and in the meantime an instruction from another program is being 
executed by said pipeline as suggested by Eickemeyer. Eickemeyer suggests that instructions
dependent upon the results of a previously dispatched instruction that has not yet completed
cause the pipeline to stall. For instance, instructions dependent on a load/store instruction in
which the necessary data is not in the cache cannot be executed until the data becomes available
in the cache. Allowing out-of-order completion, so that an instruction from another program can
be executed by said pipeline in the meantime, enables the pipeline to do useful work when a
pipeline stall condition is detected instead of being idle and not accomplishing any work while
waiting (col. 3, lines 37-63; col. 4, lines 14-25).

15. As per Claim 5, Kohn does not teach further including an output buffer for storing out of 
order data output. However, Eickemeyer teaches further including an output buffer for storing 
out of order data output (col. 8, line 43-col. 9, line 6). 

It would have been obvious to one of ordinary skill in the art at the time of invention by 
applicant to modify Kohn to include an output buffer for storing out of order data output because 
Eickemeyer suggests that an output buffer is needed in order to reorder the instructions so that 
they are out of order (col. 8, line 43-col. 9, line 6), and it is advantageous to output data out of 
order for the same reasons given in the rejection for Claim 3.

16. As per Claim 6, Kohn does not teach further including one or more of a register copy,
program counter, and program counter stack provided for each of said plurality of programs. 
However, Eickemeyer teaches further including one or more of a register copy, program counter, 
and program counter stack provided for each of said plurality of programs (col. 8, lines 57-63). 

It would have been obvious to one of ordinary skill in the art at the time of invention by 
applicant to modify Kohn to include one or more of a register copy, program counter, and 
program counter stack provided for each of said plurality of programs because Eickemeyer 
suggests that a register copy and a program counter are needed in order to ensure that the
thread is executing the correct or desired branch path (col. 8, line 43-col. 9, line 6). 

17. As per Claim 11, Kohn does not teach wherein said data memory comprises a cache. 
However, Eickemeyer teaches wherein said data memory comprises a cache (col. 10, lines 24- 
26). 

It would have been obvious to one of ordinary skill in the art at the time of invention by 
applicant to modify Kohn so that said data memory comprises a cache because Eickemeyer 
suggests that cache memories store frequently used and other data nearer the processor and allow 
instruction execution to continue without waiting the full access time of a main memory (col. 3, 
lines 21-25). 

Application/Control Number: 09/625,812

Art Unit: 2628

18. Claim 8 is rejected under 35 U.S.C. 103(a) as being unpatentable over Kohn
(US005261063A) and Davis (US005357617A) in view of Nguyen (US005961628A).

Kohn and Davis are relied upon for the teachings as discussed above relative to Claim 1.
However, Kohn and Davis do not teach wherein said processor executes SIMD vector
instructions of vector length N and executes in parallel a plurality of instructions having SIMD
vector lengths that sum up to N. However, Nguyen teaches that the processor executes SIMD vector
instructions of vector length N and executes in parallel a plurality of instructions having SIMD
vector lengths that sum up to N (col. 1, lines 11-24, 53-60).

It would have been obvious to one of ordinary skill in the art at the time of invention by 
applicant to modify Kohn and Davis to include executing in parallel instructions having SIMD 
vector lengths that sum up to N because Nguyen teaches fast speed for repetitive tasks (col. 1, 
lines 10-25). 

19. Claims 12 and 13 are rejected under 35 U.S.C. 103(a) as being unpatentable over Kohn
(US005261063A) and Davis (US005357617A) in view of Narayanaswami (US005973705A). 

Kohn and Davis are relied upon for the teachings as discussed above relative to Claim 9. 

However, Kohn and Davis do not teach address space of data memory has frame buffer 
unit and texture memory unit. But, Narayanaswami teaches SIMD graphics processing system 
having frame buffer unit (frame buffer 110f, Fig. 2A) while implicitly suggesting texture
memory unit. 

It would have been obvious to one of ordinary skill in the art at the time of invention by 
applicant to modify Kohn and Davis so address space of data memory has frame buffer unit and 
texture memory unit because Narayanaswami teaches it reduces processing time (col. 2, lines 20- 
22). 

20. Claims 14, 16, 17, and 23 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Kohn (US005261063A), Davis (US005357617A), and Krishna (US006161173A).

21. As per Claim 14, Kohn teaches a method of executing instructions from a plurality of
programs comprising: identifying N programs of said plurality of programs wherein each of the
plurality of programs comprises a plurality of instructions; interleaving instructions from said N
programs in a processor pipeline wherein said pipeline has a depth of a plurality of program 
instruction execution stages; and wherein an instruction from another of said N programs has 
been interleaved while said first instruction is executing (col. 1, line 64-col. 2, line 14). 

However, Kohn does not expressly teach that said pipeline has an average latency of one 
instruction per cycle. However, Davis teaches this limitation, as discussed in the rejection for 
Claim 1. 

However, Kohn and Davis do not teach executing said instructions such that a first 
instruction from one of said N programs is completed before beginning execution of a second 
instruction of said one of said N programs wherein no no-op or idle is inserted into the pipeline 
for the purpose of ensuring that said first instruction is completed before beginning execution of 
said second instruction. However, Applicant's disclosure describes inserting no-ops into 
instruction stream or retarding launching of new programs until first program finishes (p. 17, 
lines 8-11). So, when no no-op is inserted, that means first instruction is completed and 
execution of second instruction can begin. Krishna teaches local scheduling circuitry stops main 
scheduler from issuing selected operation if latency of another operation would create conflict 
with main scheduler issuing selected operation (col. 2, lines 56-60). So, it is ensured first 
instruction is completed before beginning execution of second instruction. So, next instruction is 
not provided to the pipeline until a previous instruction has completed. Operation is executed if 
no no-op is inserted into pipeline (col. 5, lines 36-38; col. 2, lines 41-45). So, when no no-op is 
inserted into pipeline, this ensures first instruction is completed before beginning execution of 
second instruction. So, when no-op is inserted, this means that operation is not ready to be
executed. When associated operation which is to be executed is inserted, there is no no-op
inserted, this means that associated operation is ready to be executed, meaning first instruction is 
completed and it is now okay to execute associated operation. So, no no-op or idle is inserted for 
purpose of ensuring first instruction is completed before beginning execution of second 
instruction. 

It would have been obvious to one of ordinary skill in the art at the time of invention by 
applicant to modify Kohn and Davis so next instruction from one of plurality of programs is not 
provided to pipeline until previous instruction of one of plurality of programs has completed 
because Krishna suggests sometimes latency of another instruction would create conflict with 
main scheduler issuing selected instruction, so in order to avoid this conflict, next instruction is 
not provided to pipeline until previous instruction has completed (col. 2, lines 56-60). It would 
be obvious to modify Kohn and Davis to include checking no no-op or idle is inserted into 
pipeline for ensuring next instruction is not provided to pipeline until previous instruction has 
completed because Krishna suggests no-op is needed for indicating previous instruction has not 
yet completed (col. 5, lines 26-38). 

22. As per Claim 16, Kohn does not expressly teach further including the step of assigning a 
register to each of said N programs. However, Davis teaches further including the step of 
assigning a register to each of said N programs (col. 2, lines 25-28). 

It would have been obvious to one of ordinary skill in the art at the time of invention by 
applicant to modify Kohn to include the step of assigning a register to each of said N programs 
because Davis suggests that this makes it easier for the multiple instruction threads to be 
separately handled substantially concurrently (col. 2, lines 25-41).

23. As per Claim 17, Kohn teaches wherein said processor pipeline has a depth of N (eight)
(col. 1, line 64-col. 2, line 14). 

As per Claim 23, Kohn teaches a programmable processor for executing a plurality of 
programs, said programmable processor comprising: an execution pipeline having a depth of a 
plurality of program instruction execution stages (eight) and a depth less than or equal to the 
plurality of programs (eight) wherein each of the plurality of programs comprises a plurality of 
instructions; and an interleaver for interleaving instructions from said plurality of programs and 
providing said instructions to said pipeline for execution and wherein an instruction from another 
of said N programs has been interleaved while said first instruction is executing (col. 1, line 64- 
col. 2, line 14). 

However, Kohn does not expressly teach that the execution pipeline has an average 
pipeline latency of one instruction per cycle. However, Davis teaches this limitation, as 
discussed in the rejection for Claim 1.

However, Kohn and Davis do not teach wherein a next instruction from one of said 
plurality of programs is not provided to said pipeline until a previous instruction of said one of 
said plurality of programs has completed and wherein no no-op is inserted into the pipeline for 
the purpose of ensuring that said next instruction is not provided to said pipeline until said 
previous instruction has completed. However, Applicant's disclosure describes inserting no-ops 
into instruction stream or retarding launching of new programs until first program finishes (p. 17, 
lines 8-11). So, when no no-op is inserted, that means first instruction is completed and 
execution of second instruction can begin. Krishna teaches local scheduling circuitry stops main 
scheduler from issuing selected operation if latency of another operation would create conflict
with main scheduler issuing selected operation (col. 2, lines 56-60). So, it is ensured first
instruction is completed before beginning execution of second instruction. So, next instruction is 
not provided to the pipeline until a previous instruction has completed. Operation is executed if 
no no-op is inserted into pipeline (col. 5, lines 36-38; col. 2, lines 41-45). So, when no no-op is 
inserted into pipeline, this ensures first instruction is completed before beginning execution of 
second instruction. So, when no-op is inserted, this means that operation is not ready to be 
executed. When associated operation which is to be executed is inserted, there is no no-op 
inserted, this means that associated operation is ready to be executed, meaning first instruction is 
completed and it is now okay to execute associated operation. So, no no-op is inserted for 
purpose of ensuring first instruction is completed before beginning execution of second 
instruction. This would be obvious for the same reasons given in the rejection for Claim 14. 

24. Claim 15 is rejected under 35 U.S.C. 103(a) as being unpatentable over Kohn
(US005261063A), Davis (US005357617A), and Krishna (US006161173A) in view of
Eickemeyer (US006061710A).

Kohn, Davis, and Krishna are relied upon for teachings as discussed above relative to 
Claim 14. 

However, Kohn, Davis, and Krishna do not teach further including the step of assigning a 
program counter to each of said N programs. However, Eickemeyer teaches further including 
the step of assigning a program counter to each of said N programs (col. 8, lines 57-60). This 
would be obvious for the same reasons given in the rejection for Claim 6.

25. Claim 18 is rejected under 35 U.S.C. 103(a) as being unpatentable over Kohn
(US005261063A), Davis (US005357617A), and Krishna (US006161173A) in view of Nguyen
(US005961628A).

Claim 18 is similar in scope to Claim 8, and so is rejected under the same rationale. 

26. Claim 33 is rejected under 35 U.S.C. 103(a) as being unpatentable over Joffe
(US006330584B1) in view of Krishna (US006161173A).

Joffe teaches a method, by a programmable processor, of executing instructions from 
plurality of programs (col. 2, lines 66-67), assigning 1st output register slot to first of plurality of 
programs wherein each of the plurality of programs has plurality of instructions (col. 1, line 62-
col. 2, line 11). If wait signal is asserted, instruction execution is not completed, so instruction
will be executed again until wait signals are deasserted, then next instruction can be executed 
(col. 10, lines 20-24, 31-32), and process repeats until all instructions have been executed. Joffe 
teaches if task attempts to access unavailable resource, task is suspended. When resource 
becomes available, suspended task is resumed, and instruction accessing resource is re-executed. 
Task does not get access to same resource until after every other task sharing resource has 
finished accessing resource (col. 2, lines 29-39). If Wait signal is asserted, instruction execution 
is not completed and PC register is frozen, but task remains active, and instruction will be 
executed again starting next clock cycle. If Suspend/Wait signals are deasserted, the PC register 
is changed to point to next instruction (col. 10, lines 19-32). So, Joffe teaches checking to see if 
Suspend/Wait signals are deasserted, which indicates instruction execution is completed (col.
10, lines 19-32) and resource has become available (col. 2, lines 32-34), and resource becomes
available after every other task sharing resource has finished accessing resource (col. 2, lines 29-
39). So, Joffe teaches checking to see if all of the programs are completed. So, Joffe teaches
executing instructions of first program until program is completed; loading output of first 
program into its reserved space when first program is completed (col. 9, lines 26-41); checking 
to see if all of plurality of programs are completed (col. 2, lines 35-39). Wait signal is asserted if 
register is not available, and wait signal is deasserted if register is available for new instruction 
(col. 10, lines 20-24, 31-32). Each task (program) has separate register and separate flags (col. 2, 
lines 11-13). So, second output register slot is assigned to second program. If task attempts to 
access unavailable resource, task is suspended. When resource becomes available, suspended 
task is resumed, and instruction accessing resource is executed (col. 2, lines 29-34). So, Joffe 
teaches checking to see if second register slot is available to assign to second program from 
plurality of programs when first program is completed; checking to see if one or more 
instructions are available when at least one of the programs is not completed (col. 2, lines 35-39). 

However, Joffe does not teach placing no-op when no more instructions are available. 
However, Krishna teaches information in each entry describes either no-op or associated 
operation which is to be executed (col. 5, lines 36-38). So, when there is no-op, that means that 
no more instructions are available. This would be obvious for the same reasons given in the
rejection for Claim 14.

Allowable Subject Matter

27. Claims 25-31 are allowed, for reasons given in the Office Action dated March 28, 2007.

Conclusion

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to JONI HSU whose telephone number is (571)272-7785. The 
examiner can normally be reached on M-F 8am-5pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Kee Tung, can be reached on 571-272-7794. The fax phone number for the
organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the Patent
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would 
like assistance from a USPTO Customer Service Representative or access to the automated 
information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

JH 

/Joni Hsu/ 

Primary Examiner, Art Unit 2628 



